1. Introduction and Summary.

Wilcoxon [1945] introduced the two-sample rank-sum test for testing the
difference in locations for two populations. Consider $m$ observations, $X_1, \ldots, X_m$,
from an X-population and $n$ observations, $Y_1, \ldots, Y_n$, from a Y-population, all
observations being independent. If $F(u) = P(X \le u)$ is the cumulative distribution
function (c.d.f.) of the X-population and $G(u) = P(Y \le u)$ is the c.d.f.
of the Y-population, the hypothesis tested is $H_0: F(u) \equiv G(u)$ versus the
alternative (one-sided for illustration), $H_a: F(u - a) \equiv G(u)$, $a > 0$. The procedure
is to rank all observations in joint array, yielding sets of ranks $r_1, \ldots, r_m$ and
$s_1, \ldots, s_n$ corresponding to the X- and Y-observations, and to obtain the sum of
ranks, say $\Sigma s$, for the Y-sample. Let $C(m, n, \alpha)$ be the smallest positive integer for which
$P(\Sigma s \ge C \mid H_0) \le \alpha$; note that, under $H_0$, all configurations of X's and Y's in the
joint array are equally likely. When $\Sigma s \ge C$, $H_0$ is rejected in favor of $H_a$ and
the significance level is $\alpha$. Much has been written about the rank-sum test and
various tables of values of $C$ have been prepared. An excellent bibliography on
nonparametric tests has been prepared by Savage [1962].

Lehmann [1953] has proposed an alternative to $H_0$ different from that
above using $G(u) = F^k(u)$. Thus, given this model, we write $H_0: k = 1$ and
$H_a: k > 1$. Values of $k > 1$ lead to a change in location of the Y-population but
also to changes in shape of $G(u)$ relative to $F(u)$. Savage [1956] further
discussed the implications of the Lehmann model. The basic advantage of the model
is that it permits relatively easy calculation of the probability of each
particular configuration of X's and Y's in the joint array under the alternative
hypothesis. Any such configuration may be defined by the ranks assigned to the
Y-sample and the result is that

$$ P(s_1, \ldots, s_n \mid m, n, k) = \frac{k^n}{\binom{m+n}{n}} \prod_{j=1}^{n} \frac{\Gamma(s_j + jk - j)\,\Gamma(s_{j+1})}{\Gamma(s_{j+1} + jk - j)\,\Gamma(s_j)}, \tag{1} $$

interpretable given that $s_{n+1} = m + n + 1$, $1 \le s_1 < \cdots < s_n \le m + n$. It is
an additional property of the model that

$$ p = P(X < Y) = k/(k + 1). \tag{2} $$

The Lehmann model permits the development of two-sample, sequential rank
tests. Initial work on this problem was done by Wilcoxon, Rhodes and Bradley
[1963] and is summarized in the next section. The standard sequential
probability-ratio method of Wald [1947] was used for pairs of samples of sizes
$m$ and $n$ taken sequentially, the ranking effected within each group of two
samples. In this paper we examine certain extensions of that sequential
procedure. In addition, the Lehmann model may not be the one desired in practice
and departures from the model are considered. At the time of preparation of
this paper, much of the research was still in progress and results given here
are in some cases preliminary and conclusions somewhat tentative.
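As a numerical check on (1) and (2), the following is a minimal Python sketch, assuming the Lehmann form of (1) with the convention $s_{n+1} = m + n + 1$. The function name `lehmann_config_prob` and its arguments are illustrative only and do not come from the cited references.

```python
from itertools import combinations
from math import comb, exp, lgamma, log


def lehmann_config_prob(ranks, m, k):
    """Probability (1) of the Y-ranks `ranks` (increasing) given m X's and G = F**k."""
    n = len(ranks)
    s = list(ranks) + [m + n + 1]          # append s_{n+1} = m + n + 1
    log_p = n * log(k) - log(comb(m + n, n))
    for j in range(1, n + 1):
        a = j * k - j                      # the shift jk - j in the gamma arguments
        log_p += lgamma(s[j - 1] + a) - lgamma(s[j - 1])   # Gamma(s_j + jk - j) / Gamma(s_j)
        log_p += lgamma(s[j]) - lgamma(s[j] + a)           # Gamma(s_{j+1}) / Gamma(s_{j+1} + jk - j)
    return exp(log_p)


# Probabilities over all configurations for m = 3, n = 2 should total one.
total = sum(lehmann_config_prob(c, 3, 2.5) for c in combinations(range(1, 6), 2))

# With m = n = 1, the configuration "Y ranked second" is the event X < Y of (2).
p_x_less_y = lehmann_config_prob((2,), 1, 4.0)
```

Working on the logarithmic scale with `lgamma` avoids overflow of the gamma functions for larger groups; with $k = 1$ every configuration receives probability $1/\binom{m+n}{n}$, as required under $H_0$.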


2. Two Sequential Two-Sample Grouped Rank Tests.

Let a group of observations consist of $m$ X-observations and $n$ Y-observations
as discussed above. In Wald sequential analysis, we shall take the group as the
requisite unit to be taken sequentially. For each group the probability ratio
is required and obtainable from (1). This ratio for the $\gamma$-th group is

$$ r_\gamma(m, n, k_1, 1) = \frac{k_1^n\,(m + n)!}{\Gamma(s_{1,\gamma})} \prod_{j=1}^{n} \frac{\Gamma(s_{j,\gamma} + jk_1 - j)}{\Gamma(s_{j+1,\gamma} + jk_1 - j)}, \tag{3} $$

where $s_{j,\gamma}$ is the rank of the $j$-th Y in the $\gamma$-th group and (1) is used in the
denominator with $k = 1$ under the null hypothesis $H_0$ and in the numerator with
$k = k_1 > 1$ under $H_a$. Note that this probability ratio is dependent upon the
configuration of X's and Y's in the joint array for the $\gamma$-th group and we have
designated the sequential test based on it as the configural rank test. If one
is at the $t$-th stage of a sequential configural rank test, the test statistic is

$$ p_{1t}/p_{0t} = \prod_{\gamma=1}^{t} r_\gamma(m, n, k_1, 1) \tag{4} $$

in the notation of Wald. Suppose that $\alpha$ has been specified as the probability of
a Type I error and $\beta$ as that for a Type II error. Then the sequential decision
procedure is to

(i) Terminate the test with the rejection of $H_0$ (acceptance of $H_a$) if
$p_{1t}/p_{0t} \ge A = (1 - \beta)/\alpha$,

(ii) Terminate the test with the acceptance of $H_0$ if $p_{1t}/p_{0t} \le B = \beta/(1 - \alpha)$, or

(iii) Consider another group of observations if $B < p_{1t}/p_{0t} < A$.

Wilcoxon has suggested a simple algorithm for the computation of (1) and then (3)
and (4) which is explained and illustrated by Wilcoxon, Rhodes and Bradley [1963].
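The grouped configural rank test can be sketched in a few lines of Python. This is not the Wilcoxon algorithm referred to above, which is not reproduced here; it is a direct evaluation of the logarithm of (3) under the Lehmann form of (1), accumulated as in (4) and compared with $\ln A$ and $\ln B$. The names `y_ranks`, `log_config_ratio` and `configural_sprt` are hypothetical, and ties are assumed absent.

```python
from math import lgamma, log


def y_ranks(xs, ys):
    """Ranks of the Y-observations in the joint array of one group (no ties assumed)."""
    joint = sorted(list(xs) + list(ys))
    return sorted(joint.index(y) + 1 for y in ys)


def log_config_ratio(ranks, m, k1):
    """ln r for one group: (1) at k = k1 divided by (1) at k = 1, as in (3)."""
    n = len(ranks)
    s = sorted(ranks) + [m + n + 1]        # s_{n+1} = m + n + 1
    total = n * log(k1)
    for j in range(1, n + 1):
        a = j * (k1 - 1.0)
        total += lgamma(s[j - 1] + a) - lgamma(s[j - 1])
        total += lgamma(s[j]) - lgamma(s[j] + a)
    return total


def configural_sprt(groups, k1, alpha=0.05, beta=0.05):
    """Apply rules (i)-(iii) to successive groups; each group is a pair (xs, ys)."""
    ln_a = log((1.0 - beta) / alpha)       # upper bound, ln A
    ln_b = log(beta / (1.0 - alpha))       # lower bound, ln B
    total, t = 0.0, 0
    for xs, ys in groups:
        t += 1
        total += log_config_ratio(y_ranks(xs, ys), len(xs), k1)
        if total >= ln_a:
            return "reject H0", t, total
        if total <= ln_b:
            return "accept H0", t, total
    return "continue", t, total


# Two groups with m = n = 4 in which every Y exceeds every X.
group = ([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0])
decision, stage, log_stat = configural_sprt([group, group], k1=4.0)
```

For this extreme configuration with $k_1 = 4$ the single-group ratio is 14, below $A = 19$ when $\alpha = \beta = .05$, so one such group is not sufficient for rejection and a second is taken.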


A second two-sample grouped sequential rank test is based on the within-group
sums of ranks for the Y-sample. Let $S_\gamma$ be that rank sum for the $\gamma$-th
group, $S_\gamma = \sum_{j=1}^{n} s_{j,\gamma}$. Now, for given $k$,

$$ P\Big(\sum_{j=1}^{n} s_{j,\gamma} = S_\gamma \,\Big|\, m, n, k\Big) = \sum_{\substack{1 \le s_{1,\gamma} < \cdots < s_{n,\gamma} \le m + n \\ \sum_{j=1}^{n} s_{j,\gamma} = S_\gamma}} P(s_{1,\gamma}, \ldots, s_{n,\gamma} \mid m, n, k), \tag{5} $$

where the argument of the sum comes from (1). The probability ratio statistic for
the $\gamma$-th group for the sequential rank-sum test is

$$ R_\gamma(m, n, k_1, 1) = P\Big(\sum_{j=1}^{n} s_{j,\gamma} = S_\gamma \,\Big|\, m, n, k_1\Big) \Big/ P\Big(\sum_{j=1}^{n} s_{j,\gamma} = S_\gamma \,\Big|\, m, n, 1\Big). \tag{6} $$

At the $t$-th stage of the sequential rank-sum test, the appropriate probability-ratio
statistic is

$$ P_{1t}/P_{0t} = \prod_{\gamma=1}^{t} R_\gamma(m, n, k_1, 1) \tag{7} $$

and the decisions noted above for $p_{1t}/p_{0t}$ apply to $P_{1t}/P_{0t}$ also. The sequential
rank-sum test is easier to apply than the configural rank test when tables are
available. Both tests are facilitated through the use of logarithms on both the
probability ratios and the sequential bounds. Tables are given in the reference
for $m = n = 1(1)9$, $k_1 = 1.5$, 2.33, 4 and 9 showing $S_\gamma$; $T_\gamma$, the corresponding
rank-sum for the X-sample; the probability in (5); and $\ln R_\gamma$ from (6). The parameter $p$ of (2)
corresponding to values of $k_1$ has values .6, .7, .8 and .9. Given $S_\gamma$ for the
$\gamma$-th group, it is now possible to go directly to $\ln R_\gamma$, to sum such values for $t$
groups, and to compare the sum with $\ln A$ and $\ln B$.
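Where the tables are not at hand, the quantities in (5) and (6) can be obtained for small groups by direct enumeration. The sketch below is one way to do so, assuming the Lehmann form of (1); it enumerates all $\binom{m+n}{n}$ configurations and is therefore practical only for small $m$ and $n$. The function names are hypothetical.

```python
from itertools import combinations
from math import comb, exp, lgamma, log


def config_prob(ranks, m, k):
    """Probability (1) of an increasing tuple of Y-ranks under G = F**k."""
    n = len(ranks)
    s = list(ranks) + [m + n + 1]
    log_p = n * log(k) - log(comb(m + n, n))
    for j in range(1, n + 1):
        a = j * (k - 1.0)
        log_p += lgamma(s[j - 1] + a) - lgamma(s[j - 1])
        log_p += lgamma(s[j]) - lgamma(s[j] + a)
    return exp(log_p)


def rank_sum_pmf(m, n, k):
    """Distribution (5) of the Y rank sum, as a dict mapping S to its probability."""
    pmf = {}
    for ranks in combinations(range(1, m + n + 1), n):
        total = sum(ranks)
        pmf[total] = pmf.get(total, 0.0) + config_prob(ranks, m, k)
    return pmf


def log_rank_sum_ratio(rank_sum, m, n, k1):
    """ln R of (6) for an observed within-group rank sum."""
    return log(rank_sum_pmf(m, n, k1)[rank_sum]) - log(rank_sum_pmf(m, n, 1.0)[rank_sum])


# Example: m = n = 4 and k1 = 4, with the largest possible rank sum 5 + 6 + 7 + 8 = 26.
ln_r = log_rank_sum_ratio(26, 4, 4, 4.0)
```

A rank sum of 26 arises from a single configuration, so here the rank-sum ratio coincides with the configural ratio for that configuration; for intermediate rank sums the two statistics differ because several configurations share the same sum.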

The Lehmann model will be strange to most users of the method and
interpretation is necessary for sensible choice of $k_1$. Some insight comes from
the corresponding values of $p$ and additional help results from considering
the change in location in terms of standard deviations, for the situation where
$F(u)$ is a normal c.d.f. Values of the Y-population mean $\mu_y$ for values of $k_1$ chosen are .232, .658,
1.029 and 1.485 respectively. The standard deviation of the Y-population in
this normal case decreases as $k$ increases and values are .701 when $k = 4$ and
.598 when $k = 9$ as fractions of the standard deviation for the X-population.
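The location and scale of the Y-population under the normal case can be checked numerically. The sketch below assumes that $G(u) = F^k(u)$ with $F$ standard normal, so that the Y-density is $k\,\Phi(u)^{k-1}\varphi(u)$, and integrates by the trapezoidal rule; the function name, the integration limits and the step count are arbitrary illustrative choices.

```python
from math import erf, exp, pi, sqrt


def normal_pdf(u):
    return exp(-0.5 * u * u) / sqrt(2.0 * pi)


def normal_cdf(u):
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))


def lehmann_normal_moments(k, lo=-8.0, hi=8.0, steps=16000):
    """Mean and standard deviation of Y when G = F**k and F is standard normal (k >= 1)."""
    h = (hi - lo) / steps
    m1 = m2 = 0.0
    for i in range(steps + 1):
        u = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0    # trapezoidal end-point weights
        dens = k * normal_cdf(u) ** (k - 1.0) * normal_pdf(u)
        m1 += weight * u * dens
        m2 += weight * u * u * dens
    m1 *= h
    m2 *= h
    return m1, sqrt(m2 - m1 * m1)


mu4, sd4 = lehmann_normal_moments(4.0)
mu9, sd9 = lehmann_normal_moments(9.0)

# Values of p from (2) for the four design values of k1.
p_values = [k1 / (k1 + 1.0) for k1 in (1.5, 2.33, 4.0, 9.0)]
```

For integral $k$ the Y-observation is distributed as the largest of $k$ standard normal deviates, so the results for $k = 4$ and $k = 9$ may be compared with the means and standard deviations quoted in the text.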


Properties of these two sequential rank tests follow from results of
Wald. It follows that the processes reach a decision with probability one.
Average Sample Numbers (A.S.N.'s) and Operating Characteristic Functions
(O.C.-functions) have been evaluated and tabulated. Selected results will be
given in tables below. It appears that the rank-sum test is almost as good as
the configural rank test and is easier to use.

3. Modified Sequential Rank Tests.

It appears intuitively that better rank tests might be obtained if
complete reranking of the totality of X- and Y-observations were effected at
each stage of a sequential process. Such a procedure has considerable theoretical
interest although practical considerations are likely to dictate within-group
ranking in most applications. Merchant [1962], working with Wilcoxon and Bradley,
considered this problem.

Suppose that X- and Y-observations are taken in pairs corresponding to
the situation with $m = n = 1$ above and with no group or pair effect present.
Then, at the $t$-th stage of such a process, observations $X_1, \ldots, X_t$ and
$Y_1, \ldots, Y_t$ are ordered in joint array. A modified configural rank test would be
based on the statistic

$$ p_{1t}/p_{0t} = r(t, t, k_1, 1) = \frac{k_1^t\,(2t)!}{\Gamma(s_1)} \prod_{j=1}^{t} \frac{\Gamma(s_j + jk_1 - j)}{\Gamma(s_{j+1} + jk_1 - j)} \tag{8} $$

from (3) and a modified rank-sum test would be based on

$$ P_{1t}/P_{0t} = R(t, t, k_1, 1) \tag{9} $$

from (7) and (6). Difficulties in theory now enter since successive values of
$\ln(p_{1t}/p_{0t})$ or of $\ln(P_{1t}/P_{0t})$ may be regarded no longer as sums of independent
random variables and major assumptions for Wald's sequential analysis do not hold.
Additional difficulty in applications may occur with the modified rank-sum test
since $t$ may exceed the maximum tabular value of nine and then values of $\ln R$ will
not be easily available. The Wilcoxon algorithm will assist in the use of the
modified configural rank test.

It was decided to proceed with the modified sequential rank tests as
though the Wald bounds $A$ and $B$ were still appropriate for $p_{1t}/p_{0t}$.
Monte Carlo studies reported in part below suggest that this is appropriate.
Berk [1962] has also worked on these modified sequential rank tests at Harvard
University and has reported that he has shown that the ranks at the $t$-th
stage are sufficient for the first $t$ rankings. This work and that by Hall [1962]
is enough to justify continued use of the bounds $A$ and $B$ and the statistics of
(8) and (9).

4. Monte Carlo Results.

Since Wald formulas for A.S.N.- and O.C.-functions are not applicable for
the modified sequential rank tests, Merchant proceeded with Monte Carlo studies
on the IBM-709 computer. These studies were done only for the modified
configural rank test and the method was as follows.

An odd integral value of the parameter $k$ was chosen and $(k + 1)$ random
standard normal deviates were generated through use of a subroutine that produces
these in pairs. For each such set, the first deviate was taken as an X-observation
and the largest of the $k$ remaining deviates was taken as the Y-observation. In
this way the Lehmann model was satisfied but for the special case with $F(u)$, a
standard normal c.d.f. As each pair of X- and Y-observations was generated,
the totality of X- and Y-observations was reranked, the logarithm of the statistic
of (8) was computed, and the value so obtained was compared with $\ln A$ and $\ln B$.
For this study, Merchant took $\alpha = \beta = .05$ and hence $\ln A = 2.944$ and $\ln B = -2.944$.
The modified configural rank test was simulated with each experiment carried to
a decision with $H_0: k = 1$ and for values of $k_1$ in $H_a: k = k_1 > 1$ with $k_1 = 1.5$,
2.33, 4 and 9 for true values of $k$ of 1, 3, 5, 7 and 9. This procedure permitted
at least crude graduation of both the A.S.N.-functions and the O.C.-functions.
Values of the A.S.N.-function were computed as the average number of trials
required for a decision as 500 simulated experiments were conducted for each true
value of $k$ for each sequential design. Results for the values of $k$ indicated are
shown in Column 2 of Table 1 for the sequential design with $k_1 = 4$ to indicate the
nature of results obtained and all A.S.N.-values are in terms of numbers of
observations taken from each population. In the same way values of the O.C.-function
are in Column 2 of Table 2; these values are simply the proportions of sets of
500 experiments that led to the acceptance of $H_0$. Note that the empirically
obtained value of $\alpha$ is .034, less than the nominal value of .05; in general it
appears that true values of $\alpha$ and $\beta$ are less than the nominal ones of the
sequential design. It is also observed that the Wald method appears to be
appropriate on the basis of the Monte Carlo method.
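The sampling scheme just described can be imitated on a modern machine. The following Python sketch is an illustration only and is not Merchant's program: it assumes the statistic (8) in the Lehmann form of (1), uses Python's `random.Random` generator in place of the original subroutine, and imposes an arbitrary cap `max_pairs` on the length of an experiment. All names are hypothetical.

```python
import random
from math import lgamma, log


def log_stat_8(y_ranks, t, k1):
    """ln of (8): Y-ranks in the joint array of all 2t observations, m = n = t."""
    s = list(y_ranks) + [2 * t + 1]        # s_{t+1} = 2t + 1
    total = t * log(k1)
    for j in range(1, t + 1):
        a = j * (k1 - 1.0)
        total += lgamma(s[j - 1] + a) - lgamma(s[j - 1])
        total += lgamma(s[j]) - lgamma(s[j] + a)
    return total


def run_experiment(rng, k_true, k1, ln_a, ln_b, max_pairs=500):
    """One sequential experiment; returns (decision, number of pairs taken)."""
    xs, ys = [], []
    for t in range(1, max_pairs + 1):
        xs.append(rng.gauss(0.0, 1.0))                                # X: one normal deviate
        ys.append(max(rng.gauss(0.0, 1.0) for _ in range(k_true)))    # Y: largest of k deviates
        joint = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])   # complete reranking
        ranks = [i + 1 for i, (_, is_y) in enumerate(joint) if is_y]
        stat = log_stat_8(ranks, t, k1)
        if stat >= ln_a:
            return "reject H0", t
        if stat <= ln_b:
            return "accept H0", t
    return "undecided", max_pairs


def simulate(k_true, k1=4.0, alpha=0.05, beta=0.05, reps=500, seed=1):
    """Empirical A.S.N. (pairs per experiment) and O.C. (proportion accepting H0)."""
    rng = random.Random(seed)
    ln_a = log((1.0 - beta) / alpha)
    ln_b = log(beta / (1.0 - alpha))
    results = [run_experiment(rng, k_true, k1, ln_a, ln_b) for _ in range(reps)]
    asn = sum(t for _, t in results) / reps
    oc = sum(1 for d, _ in results if d == "accept H0") / reps
    return asn, oc


asn_null, oc_null = simulate(k_true=1, reps=200)
asn_alt, oc_alt = simulate(k_true=9, reps=200)
```

The numerical values obtained in this way depend on the generator and the seed and should not be expected to reproduce the entries of Tables 1 and 2; the sketch is intended only to make the steps of the sampling experiment explicit.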

Wilcoxon, Rhodes and Bradley gave A.S.N.- and O.C.-values using Wald's
formulas. Examples are shown for the grouped rank tests with $m = n = 4$ for
configural and rank-sum tests in Columns 3 and 4 respectively of Tables 1 and 2.
Values shown are sparse but suggest a discrepancy between these values and those
obtained by Merchant for the supposed better method. It turns out that the Wald
formulas underestimate the A.S.N.-values and Monte Carlo results based on sets
of 500 experiments for the rank-sum test are given in Columns 5. The comparison
of the modified tests and the grouped tests is confounded by the fact that
results in Columns 2 are for the modified configural rank test and in Columns 5
for the grouped rank-sum test. We do, however, believe that these comparisons
are indicative of the theoretical advantage of the modified tests.

Since it is often thought desirable to consider a model wherein X- and
Y-populations differ only in locations, sampling studies were also made with
$F(u)$, the standard normal c.d.f., and $G(u)$, the c.d.f. for a normal population
with unit variance but mean at $\mu_y$. The values taken for $\mu_y$ were those for the
mean if $G(u) = F^k(u)$. Thus in the example of Columns 6 of Tables 1 and 2, $\mu_y$ was
taken to be 0, .564, .846, 1.029, 1.163, 1.352 and 1.485 corresponding to values
of $k$ of 1, 2, 3, 4, 5, 7, and 9 respectively. It is to be noted that the A.S.N.-
numbers are somewhat higher for the translation experiments and that the O.C.-
numbers are also higher. In particular for data fitting the Lehmann model, the
observed $\beta = .026$, less than the nominal .05, while for the data fitting the
translation model the observed $\beta = .060$. For practical purposes it is not
thought that the method is too dependent on applications meeting the Lehmann
model.

[Table 1. Values of the A.S.N.-function. Footnotes: Actual true value of k is 1.96. Based on 1000 experiments; remainder of tables based on 500 experiments.]

In order to obtain as much information as possible from the Monte Carlo
studies, truncation of the process was also considered. The final columns of
Tables 1 and 2 show results obtained with a forced decision after five groups,
$m = n = 4$, for the grouped rank-sum test. For those experiments not already
terminated after five groups, $H_0$ was accepted when the logarithm of the
probability ratio in (6) was negative and $H_a$ was accepted when it was positive; this
appeared to be an acceptable rule due to the symmetry present with $\alpha = \beta = .05$.
From Table 1 it is seen that savings resulted for middle values of $k$ particularly
(compare Columns 5 and 7) and from Table 2 note that neither $\alpha$ nor $\beta$ (.05 and
.046 respectively) was seriously inflated.
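The truncation rule can be expressed compactly. The sketch below takes the per-group values of $\ln R_\gamma$ as given and assumes, as an interpretation of the text, that the forced decision after five groups is made on the sign of the accumulated logarithm of the probability ratio. The treatment of an accumulated value of exactly zero is not stated in the text and is assigned to $H_0$ here purely as an assumption; the function name is hypothetical.

```python
def truncated_rank_sum_test(log_ratios, ln_a, ln_b, max_groups=5):
    """Wald bounds applied group by group, with a forced decision after max_groups."""
    total = 0.0
    used = 0
    for value in log_ratios[:max_groups]:
        total += value
        used += 1
        if total >= ln_a:
            return "accept Ha", used
        if total <= ln_b:
            return "accept H0", used
    if used < max_groups:
        return "continue", used            # fewer than max_groups groups available so far
    # Forced decision on the sign of the accumulated log ratio (zero goes to H0: assumption).
    return ("accept Ha" if total > 0.0 else "accept H0"), used


# With alpha = beta = .05 the bounds are ln A = 2.944 and ln B = -2.944.
forced = truncated_rank_sum_test([0.5, 0.5, 0.5, 0.5, 0.5], 2.944, -2.944)
early = truncated_rank_sum_test([2.0, 2.0, 0.1], 2.944, -2.944)
```

In the first call the accumulated value 2.5 never reaches $\ln A$, so the decision is forced at the fifth group; in the second the upper bound is crossed at the second group and truncation plays no part.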

The results of Tables 1 and 2 have been selected to show effects that we
believe to be indicative for a sequential system with realistic values of $\alpha$, $\beta$,
$k_1$ and $m$, $n$. Other Monte Carlo results have been obtained but investigations are
continuing. It is expected that complete results will be reported at a later
date.

5. Remarks

It is perhaps not appropriate to make many additional comments in this
paper. The grouped sequential rank tests will be available in the cited
reference well before the Fifth International Biometric Conference takes place.
Results obtained by Merchant are in preparation for publication. Monte Carlo
studies are still in progress with this work largely being done by Donald C.
Martin working with Wilcoxon and the present author. We conclude simply by
noting that illustrative examples of the grouped sequential rank tests are given
by Wilcoxon, Rhodes and Bradley [1963] and we believe that these examples are
typical of applications of these methods that may usefully be made.